Jason Kuen

FLARE: Diffusion for Hybrid Language Model

Jun 01, 2026

Inline Critic Steers Image Editing

May 12, 2026

DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation

Mar 20, 2026

ViT-AdaLA: Adapting Vision Transformers with Linear Attention

Mar 17, 2026

SNCE: Geometry-Aware Supervision for Scalable Discrete Image Generation

Mar 16, 2026

LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models

Feb 15, 2026

Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models

Dec 16, 2025

VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction

Dec 11, 2025

OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive

Nov 14, 2025

Image Tokenizer Needs Post-Training

Sep 15, 2025